ABSTRACT: Two modified merge algorithms for the longest upsequence
problem  are  presented.  The  algorithms  take  advantage  of  natural  runs  in 
the  input  sequence  and  have  a  worst  case  O(nlogn)  time  complexity  when 
appropriate  merging  techniques  are  used,  but  can  be  linear  if  long  runs 
are  present  in  the  sequence.  The  first  algorithm  is  logically 
equivalent  to  Dijkstra's  algorithm;  the  second  algorithm  is  based  on 
the  first  one  but  uses  a  different  merging  technique.  Concluding 
remarks  describe  how  these  results  evolved  out  of  our  work  in 
programming  by  transformation. 


Longest  upsequence  problem 

Given  a  sequence  of  n  integers,  an  upsequence  is  a  subsequence 
which  is  ordered  in  nondecreasing  order.  A  subsequence  is  any  subset  of 
the  original  sequence  in  which  the  original  order  is  retained  (hence 
there  are  2**n  possible  subsequences.) 


The  problem  we  consider  is:  Give  an  algorithm  which,  given  a 
sequence,  will  return  the  length  of  its  longest  upsequence.  (Note  that 
there may be more than one longest upsequence of the same length.)

Dijkstra's algorithm

Dijkstra's  algorithm  [Di]  examines  elements  in  the  original 
sequence  from  left  to  right.  When  a  'next'  element  is  examined  it  is 
inserted  into  what  Dijkstra  denotes  as  the  m  array  (consisting  of 
minimum  rightmost  elements  of  upsequences  of  length  1,2,...).  (Here  we 
will  refer  to  the  'm  array'  as  the  'target  sequence'.)  New  sequence 
elements  x  are  inserted  into  the  target  sequence  either  by  being  added 
to the right end (if x is at least as large as the rightmost element of the
partial  target  sequence)  or  by  'bumping'  an  element  already  in  the 
target  sequence.   If  x  =  A[n]  is  the  next  element  of  the  input  sequence, 
the heart of the Dijkstra algorithm is:

if m[k] <= A[n] then k := k+1; m[k] := A[n];
elseif A[n] < m[1] then m[1] := A[n];
else "establish j such that
          m[j-1] <= A[n] < m[j]";
     m[j] := A[n];
end if;

Dijkstra uses a binary search technique to determine j, which makes the
complexity  of  his  algorithm  O(nlogn). 
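
As an illustration only, the fragment above can be sketched in Python. This is a minimal sketch and not Dijkstra's own text: the function name longest_upseq_length is hypothetical, and the standard-library call bisect_right stands in for the binary search that establishes j.

```python
from bisect import bisect_right

def longest_upseq_length(seq):
    """Length of the longest nondecreasing subsequence of seq."""
    m = []  # m[j-1] = smallest rightmost element of an upsequence of length j
    for x in seq:
        j = bisect_right(m, x)    # index of the first element of m that is > x
        if j == len(m):
            m.append(x)           # x extends the longest upsequence found so far
        else:
            m[j] = x              # x bumps the first element larger than x
    return len(m)

print(longest_upseq_length([1, 3, 5, 7, 8, 2, 9, 4, 10, 6]))  # 7
```

Each element costs one binary search in m, which is where the O(nlogn) bound comes from.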

First Modified Merge Algorithm

The  algorithm  presented  here,  although  logically  equivalent  to 
Dijkstra's  algorithm,  uses  a  fresh  and  more  efficient  approach.  Rather 
than  examining  the  input  sequence  element  by  element  in  a  left  to  right 
manner,  the  algorithm  proceeds  as  follows: 

1) Divide the input sequence into natural ascending runs
r1, r2, ..., rm;

2)  Merge  runs  in  a  left  to  right  manner  using  the  modified 
merge  operation  detailed  below  (The  order  of  merges  is  as 
follows:  merge  rl  with  r2;  merge  the  resulting  run  with  r3; 
continue merging in this manner r4, ..., rm with the leftmost run.)

3) Output the length of the final merged sequence, which will
be  equal  to  the  length  of  the  longest  upsequence. 

Our  nonstandard  merge  of  two  adjacent  runs  operates  in  a  manner  rather 
similar  to  standard  merging.  However,  whenever  the  next  member  p  of  the 
right  run  is  smaller  than  the  next  member  q  of  the  left  run,  p  bumps  q, 
i.e.  p  is  copied  into  the  merged  run  but  q  is  not,  and  we  advance  one 
step in both runs to continue the merging.

This procedure maintains the following invariant property (which shows
it  to  be  equivalent  to  Dijkstra's  algorithm):  The  j-th  element  from  the 
left  in  the  merged  sequence  is  always  the  smallest  rightmost  element  of 
an  upsequence  of  length  j  in  the  portion  of  the  original  sequence  which 
has  entered  so  far  into  the  merge. 
The following program, "longest_upseq", uses the modified merge to
solve  the  longest  upsequence  problem. 


program longest_upseq;

loop do
    read(d);
    print(#merge_runs(create_runs(d)));
end loop;

procedure create_runs(d);
    return if exists i in [1..#d-1] | d(i) > d(i+1) then
               [d(1..i)] + create_runs(d(i+1..))
           else [d] end;
end procedure;

procedure merge_runs(rs);
    return if #rs = 1 then rs(1)
           else merge_runs([merge(rs(1), rs(2))] + rs(3..)) end;
end procedure;

procedure merge(a, b);
    return if a = [] then b
           elseif b = [] then a
           elseif a(1) <= b(1) then a(1..1) + merge(a(2..), b)
           else b(1..1) + merge(a(2..), b(2..)) end;
end procedure;

end program;


Consider the following example.

Let the input sequence be

1 3 5 7 8 2 9 4 10 6

After identifying the runs we get

1 3 5 7 8 / 2 9 / 4 10 / 6

Merge r1 with r2:

1 2 5 7 8 9 / 4 10 / 6

Merge the new run with r3:

1 2 4 7 8 9 10 / 6

Merge with r4:

1 2 4 6 8 9 10
The  length  of  the  longest  upsequence  is  therefore  7. 
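
The trace above can be reproduced with a short Python sketch. It is an illustrative translation rather than the program given earlier: it is written iteratively, and the names bump_merge and longest_upseq are hypothetical. Unlike the original create_runs, this version returns no runs at all for an empty input.

```python
def create_runs(d):
    """Split d into maximal nondecreasing (natural) runs."""
    runs = []
    for x in d:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs

def bump_merge(a, b):
    """Merge target a with run b; a smaller element of b bumps the element of a."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
        else:
            out.append(b[j])      # b[j] bumps a[i], which is dropped
            j += 1
        i += 1                    # the left run advances in both cases
    return out + a[i:] + b[j:]

def longest_upseq(d):
    """Final target sequence; its length is the longest upsequence length."""
    target = []
    for run in create_runs(d):
        target = bump_merge(target, run)
    return target

print(longest_upseq([1, 3, 5, 7, 8, 2, 9, 4, 10, 6]))  # [1, 2, 4, 6, 8, 9, 10]
```

Note that the final target sequence need not itself be an upsequence of the input; only its length is meaningful.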
Complexity of the Algorithm

The  merge  algorithm  presented  here  can  certainly  be  made  O(nlogn). 
Each  merge  could  be  accomplished  with  k  binary  insertions  where  k  is  the 
length  of  the  right  hand  run.  In  this  case  the  algorithm  is  at  least  as 
good  as  Dijkstra's.  Moreover,  merging  runs  by  binary  insertion  (rather 
than  inserting  individual  elements)  gives  better  actual  performance 
since the portion of the target sequence to be searched shrinks as successive elements are inserted,
and  since  "tails"  of  runs  can  be  appended  in  constant  time. 
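
A merge by binary insertion along these lines might be sketched as follows. This is a hedged illustration: the name insert_run is hypothetical, and a Python list is assumed for the target sequence, so the tail is appended with one call rather than in true constant time.

```python
from bisect import bisect_right

def insert_run(target, run):
    """Bump-merge an ascending run into the sorted list target, in place."""
    lo = 0
    for k, x in enumerate(run):
        # Search only to the right of the previous insertion point.
        j = bisect_right(target, x, lo)
        if j == len(target):
            target.extend(run[k:])   # x and the rest of the run form the tail
            break
        target[j] = x                # x bumps the first element larger than x
        lo = j + 1
    return target

print(insert_run([1, 3, 5, 7, 8], [2, 9]))  # [1, 2, 5, 7, 8, 9]
```

Because the run is ascending, each later element must land to the right of the previous one, so the searched range only gets shorter.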

It  is  of  interest  to  consider  a  sequence  which  is  made  up  of  a 
small number of very long natural runs. In this case, while merging
done  by  binary  insertion  will  produce  an  O(nlogn)  algorithm,  ordinary 
tape merging will produce an O(n) algorithm. In fact, a random sequence
will  have  both  long  runs  and  short  runs.  While  short  runs  (say  of 
length  1  or  2)  are  best  merged  by  binary  insertion,  long  runs  are  best 
merged  by  tape  merging. 

A  hybrid  merging  algorithm  such  as  the  Hwang-Lin  binary  merging 
algorithm  [Kn]  could  be  used  to  advantage  here.  In  their  analysis  Hwang 
and  Lin  [HL]  show  that  the  maximum  number  of  compares  needed  to  merge  m 
elements  with  n  elements,  K(m, n),  using  their  algorithm,  is  equal  to 
that needed by binary insertion for m=1 or m=2 and is less than that
needed  by  binary  insertion  for  m>2.   That  is, 

[Hwang-Lin] K(m,n) = [binary insertion] K(m,n) = ceil(log(n+1))
for m=1 or m=2

and 

[Hwang-Lin]  K(m,n)  <  [binary  insertion]  K(m,n)  for  m>2. 

Also 
[Hwang-Lin]  K(m,n)  =  [tape  merge]  K(m,n)  =  m+n-1  for  m~n. 

Clearly  the  modified  merge  algorithm  for  the  longest  upsequence 
problem  using  the  Hwang-Lin  binary  merge  (or  some  variant  such  as  one  of 
those proposed by Manacher [M1], [M2]) would give slightly better worst
case  performance  than  Dijkstra's  algorithm.  Since  the  expected  time  for 
the  Hwang-Lin  algorithm  would  also  be  better  than  that  for  binary 
insertion,  our  modified  merge  would  also  enjoy  better  average  case 
performance  (however,  we  do  not  present  a  formal  analysis  here.) 

It  is  also  interesting  to  note  that  (to  our  knowledge)  a  precise 
lower  bound  order  for  the  longest  upsequence  problem  is  not  known.  One 
would  either  expect  future  investigations  to  show  that  this  bound  is 
O(nlogn),  or  else  reveal  faster  algorithms  in  which  merging  techniques 
might  play  an  important  role. 
Second Modified Merge Algorithm


Next  we  consider  an  interesting  variant  of  the  first  algorithm.  We 
first  observe  that  we  can  get  a  'mirror-image'  of  the  preceding 
algorithm  by  merging  runs  from  right  to  left,  going  through  each  run  in 
descending  order  (also  from  right  to  left)  and  letting  larger  elements 
from  the  left  run  bump  smaller  elements  in  the  right  run. 


Applied  to  the  same  example  shown  above,  this  variant  will  work  as 
follows: 

We  begin  by  partitioning  the  sequence  into  runs  as  before: 

1 3 5 7 8 / 2 9 / 4 10 / 6

We first merge-to-the-right r3 with r4:

1 3 5 7 8 / 2 9 / 4 10

Then we merge r2 with the new run:

1 3 5 7 8 / 2 9 10

And finally merge r1:

1 3 5 7 8 9 10

Again, the length of the longest upsequence is 7.
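
The mirror-image merge can be sketched in Python as follows. This is an illustrative sketch only: the name right_merge is chosen here for the merge-to-the-right step, and the run list is written out by hand.

```python
def right_merge(a, b):
    """Scan both runs from the right; a larger element of the left run a
    bumps the corresponding element of the right run b."""
    out, i, j = [], len(a), len(b)    # i, j = number of elements still unread
    while i > 0 and j > 0:
        if b[j - 1] >= a[i - 1]:
            out.append(b[j - 1])
        else:
            out.append(a[i - 1])      # a's element bumps b's element
            i -= 1
        j -= 1                        # the right run advances in both cases
    out.reverse()
    return a[:i] + b[:j] + out

runs = [[1, 3, 5, 7, 8], [2, 9], [4, 10], [6]]
target = []
for run in reversed(runs):            # merge runs from right to left
    target = right_merge(run, target)
print(target)  # [1, 3, 5, 7, 8, 9, 10]
```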


In  analogy  with  the  invariant  maintained  by  Dijkstra's  algorithm 
and  by  our  first  algorithm,  the  reverse  algorithm  maintains  the 
following  'reverse'  invariant: 

Let  m  denote  the  target  sequence  constructed  by  the  reverse  algorithm. 
Let  k  be  the  j'th  element  from  the  right  of  m.  Then  k  is  the  largest 
leftmost  element  of  an  upsequence  of  length  j  in  the  original  sequence. 


None  of  this  changes  our  basic  approach.   But  now  we  propose  to  mix 
the  two  algorithms  as  follows: 

1)  Partition  the  set  of  all  runs  into  two  equal  parts, 
a  left  one  containing  the  leftmost  half  of  the  runs, 
and a right one, containing the rightmost half.

2)  Apply  the  first  algorithm  to  the  left  part,  and  the 
reverse  algorithm  to  the  right  part,  to  obtain  two 
final  runs. 

3)  Merge  these  two  runs  by  using  either  the  first  algorithm 
or  its  reverse  counterpart. 

4)  The  length  of  the  resulting  run  is  then  the  length  of  the 
longest upsequence.

Considering the same example as before, we again have the following
partition into runs:

1 3 5 7 8 / 2 9 / 4 10 / 6

We merge r1 with r2 'to the left', and merge r3 with r4 'to the right'
yielding:

1 2 5 7 8 9 / 4 10

Now we merge these runs, say, to the right:

1 2 5 7 8 9 10

and obtain 7 as the length of the longest upsequence.


The  correctness  of  this  hybrid  algorithm  is  more  difficult  to 
establish than that of the preceding algorithms. It is proved as
follows:


Let U1, U2 ... Us be the target sequence obtained by left-merging
the left half of the runs, and let V1, V2 ... Vt be the target sequence
obtained by right-merging the right half of the runs, but written
backwards, so that V1 is its rightmost element, and Vt is the leftmost.
Written  this  way,  U  is  increasing,  whereas  V  is  decreasing. 

These  sequences  have  the  following  properties: 

(A)  For  each  i  in  [l..s],  Ui  is  the  smallest  rightmost  member  of  an 
upsequence  of  length  i  contained  in  the  left  half  of  the  original 
sequence  (i.e.  in  the  concatenation  A-  of  the  leftmost  half  of  the 
runs), and there does not exist in A- an upsequence of length > s.

(B)  For  each  j  in  [l..t],  Vj  is  the  largest  leftmost  member  of  an 
upsequence  of  length  j  contained  in  the  right  half  A+  of  the  original 
sequence;   A+  does  not  contain  an  upsequence  of  length  >  t. 

Now  let  L  be  an  upsequence  contained  in  the  whole  of  the  original 
sequence  A,  having  p  elements  in  A-  and  q  elements  in  A+.   Write  L  as 

L1, L2 ... Lp, L'q ... L'2, L'1
Then,  by  properties  (A)  and  (B),  we  have  p  <=  s,  q  <=  t  and 

Li  >=  Ui   ,   for  i  in  [1  ..  p] , 

L'j  <=  Vj  ,  for  j  in  [1  ..  q] . 
But  since  L  is  increasing,  we  must  have 

Ui  <=  Li  <=  L'j  <=  Vj   ,  for  i  and  j  as  above. 
and in particular

Up  <=  Vq. 

Consequently,  if  for  some  p  and  q  the  last  inequality  fails  to  hold, 
there  cannot  exist  an  upsequence  in  A  having  p  elements  in  A-  and  q 
elements  in  A+.  The  converse  statement  is  also  true.  Indeed,  suppose 
that  the  last  inequality  does  hold.  Then,  by  properties  (A)  and  (B), 
there  exists  a  p-upsequence  in  A-  ending  in  Up,  and  a  q-upsequence  in  A+ 
starting  in  Vq.   Their  concatenation  is  the  desired  upsequence. 

We  have  thus  proved  the  following 

LEMMA:  There  exists  an  upsequence  in  A  having  p  elements  in  A-  and  q 
elements  in  A+  if  and  only  if  p  <=  s,  q  <=  t  and  Up  <=  Vq. 
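
The lemma can be checked on small inputs with the following Python sketch. It is a hedged illustration: the function names are hypothetical, U is computed by binary insertion rather than by merging, and V is obtained by applying the same routine to the reversed and negated right half.

```python
from bisect import bisect_right

def smallest_ends(seq):
    """U: U[i-1] = smallest rightmost element of an upsequence of length i."""
    u = []
    for x in seq:
        j = bisect_right(u, x)
        if j == len(u):
            u.append(x)
        else:
            u[j] = x
    return u

def largest_starts(seq):
    """V: V[j-1] = largest leftmost element of an upsequence of length j."""
    # Mirror image: reverse the sequence and negate its elements.
    return [-x for x in smallest_ends([-x for x in reversed(seq)])]

def longest_by_lemma(left, right):
    """Largest p + q allowed by the lemma (p or q may also be 0)."""
    u, v = smallest_ends(left), largest_starts(right)
    best = max(len(u), len(v))            # upsequences lying in one half only
    for p in range(1, len(u) + 1):
        for q in range(1, len(v) + 1):
            if u[p - 1] <= v[q - 1]:      # the condition Up <= Vq
                best = max(best, p + q)
    return best

print(longest_by_lemma([1, 3, 5, 7, 8, 2, 9], [4, 10, 6]))  # 7
```

For this input U = 1 2 5 7 8 9 and V = 10 4, and the maximum is attained at p = 6, q = 1, since U6 = 9 <= V1 = 10.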

Let us now suppose that we have left-merged the sequences U and V to
obtain a final target sequence W. From here on we write m = s and n = t
for the lengths of U and V. Note that our bumping merge is such
that the length r of W is at least m, and at most m+n. We can thus
write r = m + k, where k in [0..n]. Moreover, the k rightmost elements
of W (reading from right to left) must be

V1, V2 ... Vk.

Let us now show that A contains an upsequence of length m + k. Write W
as

W1, W2 ... Wm, Vk ... V2, V1

Scan the W's from right to left starting at Wm, till a member belonging
to the U sequence is found. It is easily seen that elements of U that
pass into the merged sequence retain their original positions in the
target sequence. Let the first element of U found in W be Ui = Wi.
This means that the next element to its right is V(k+m-i). Thus we have
Ui <= V(k+m-i), so that by the lemma there exists in A an upsequence of
length k+m.

Next  we  show  that  there  cannot  exist  a  longer  upsequence  in  A.  Indeed, 
suppose  that  there  exists  a  (k+m+l)  upsequence.  Since  it  cannot  have 
more  than  m  elements  in  A-,  it  must  have  at  least  k+1  elements  in  A+. 
Suppose  that  it  has  k  +  h  elements  in  A+.  By  (B),  its  leftmost  element 
must be <= V(k+h). However, V(k+1) .. V(k+h) have all bumped some
elements of U, so that V(k+h) must have bumped an element of U which is
at least h places from the right end of U, and so is <= U(m-h+1). All
this implies that

V(k+h) < U(m-h+1)

so  that  by  the  lemma  such  an  upsequence  cannot  exist.  This  proves  our 
claim. 
For the sake of completeness, here is the hybrid algorithm in
detail. 

program longest_upseq;

loop do
    read(d);
    print(#merge_runs(create_runs(d)));
end loop;

procedure create_runs(d);
    return if exists i in [1..#d-1] | d(i) > d(i+1) then
               [d(1..i)] + create_runs(d(i+1..))
           else [d] end;
end procedure;

procedure merge_runs(rs);
    return
        case #rs of
        (0): [],
        (1): rs(1),
        (2): left_merge(rs(1), rs(2)),
        (3): left_merge(rs(1), right_merge(rs(2), rs(3)))
        else merge_runs([left_merge(rs(1), rs(2))] +
                        rs(3..#rs-2) +
                        [right_merge(rs(#rs-1), rs(#rs))])
        end;
end procedure;

procedure left_merge(a, b);
    return if a = [] then b
           elseif b = [] then a
           elseif a(1) <= b(1) then a(1..1) + left_merge(a(2..), b)
           else b(1..1) + left_merge(a(2..), b(2..)) end;
end procedure;

procedure right_merge(a, b);
    return if b = [] then a
           elseif a = [] then b
           elseif b(#b) >= a(#a) then
               right_merge(a, b(1..#b-1)) + [b(#b)]
           else right_merge(a(1..#a-1), b(1..#b-1)) + [a(#a)] end;
end procedure;

end program;
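
For readers who wish to experiment, the hybrid program might be transcribed into Python roughly as follows. This is a sketch under stated assumptions: the recursion is replaced by loops, the helper names follow the program above, and longest_upseq_length is a hypothetical wrapper.

```python
def create_runs(d):
    """Split d into maximal nondecreasing runs."""
    runs = []
    for x in d:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs

def left_merge(a, b):
    """Left-to-right bump merge: a smaller element of b bumps a's element."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
        else:
            out.append(b[j])
            j += 1
        i += 1
    return out + a[i:] + b[j:]

def right_merge(a, b):
    """Right-to-left bump merge: a larger element of a bumps b's element."""
    out, i, j = [], len(a), len(b)
    while i > 0 and j > 0:
        if b[j - 1] >= a[i - 1]:
            out.append(b[j - 1])
        else:
            out.append(a[i - 1])
            i -= 1
        j -= 1
    out.reverse()
    return a[:i] + b[:j] + out

def merge_runs(rs):
    """Consume runs from both ends, following the case expression above."""
    rs = list(rs)
    while len(rs) > 3:
        rs = [left_merge(rs[0], rs[1])] + rs[2:-2] + [right_merge(rs[-2], rs[-1])]
    if len(rs) == 3:
        rs = [rs[0], right_merge(rs[1], rs[2])]
    if len(rs) == 2:
        return left_merge(rs[0], rs[1])
    return rs[0] if rs else []

def longest_upseq_length(d):
    return len(merge_runs(create_runs(d)))

print(longest_upseq_length([1, 3, 5, 7, 8, 2, 9, 4, 10, 6]))  # 7
```

Since the program performs the last merge with left_merge, the final target sequence for this input is 1 2 4 7 8 9 10; its length is again 7.
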
Discussion


Our  approach  via  a  modified  merge  makes  the  longest  upsequence 
problem  a  not-so-distant  relative  of  sorting.  The  striking  similarity 
between  the  modified  merge  and  the  standard  merge  used  in  sorting  is 
shown  by  the  fact  that  in  the  procedure  "merge"  shown  above,  a  change 
from a(2..) to a(1..) in the last line of the if-expression would turn
our  longest  upsequence  algorithm  into  a  standard  sort-by-merging 
algorithm.  It  is  possible  that  further  investigation  of  these 
similarities  would  yield  other  interesting  approaches  to  the  longest 
upsequence  problem.  Note  that  Dijkstra's  algorithm  stands  in  complete 
analogy  to  the  binary-insertion  sort.  (In  fact  it  is  the  only 
sequential  binary  insertion  technique  which  does  not  require  us  to  move 
the  sorted  elements,  because  of  the  bumping  used,  and  therefore  attains 
its  high  efficiency.) 
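
The one-instruction difference can be made concrete with a small Python sketch. It is illustrative only: the flag bump and the helper fold_runs are hypothetical devices for switching between the two merges, not part of the programs given earlier.

```python
def create_runs(d):
    """Split d into maximal nondecreasing runs."""
    runs = []
    for x in d:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs

def merge(a, b, bump):
    """Standard merge when bump is False, modified (bumping) merge when True."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
            if bump:
                i += 1            # the single differing step: also drop a[i]
    return out + a[i:] + b[j:]

def fold_runs(d, bump):
    """Merge the natural runs of d from left to right."""
    result = []
    for run in create_runs(d):
        result = merge(result, run, bump)
    return result

d = [1, 3, 5, 7, 8, 2, 9, 4, 10, 6]
print(fold_runs(d, bump=False))      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(len(fold_runs(d, bump=True)))  # 7
```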

The  symmetries  between  the  algorithms  for  the  upsequence  problem 
and  the  algorithms  for  sorting  are  of  particular  interest  to  us  in  the 
context of our work in transformational programming [DS], [Sh]. In
particular  we  have  been  investigating  the  role  of  the  high  level 
specification  in  programming  by  transformation  [Me].  We  have  found  that 
it  is  often  the  case  that  more  than  one  high  level  specification  can  be 
given  for  a  particular  problem.  Moreover,  different,  well  chosen 
specifications  will  lead  to  different  derived  algorithms.  We 
hypothesize that in some cases the specification style, or pattern, has
implicitly  imbedded  within  it  some  algorithm  synthesis  construct  or 
paradigm  which  determines  the  structural  characteristics  of  the  derived 
algorithms.  One  such  construct  is  the  "divide  and  conquer  paradigm" 
discussed by Green and Barstow [GB]. This paradigm refers to the notion
of  how  an  algorithm  "splits"  its  input  in  order  to  process  it.  Two 
alternative methods of splitting the input are the "singleton split" and
the  "equal-size  split"  (which  we  prefer  to  call  the  "partitional 
split".)  In  sorting,  selection  and  insertion  sorts  are  representative  of 
the  singleton  method,  while  quick  and  merge  sorts  are  representative  of 
the  partitional  method. 

Our  merge  approaches  to  the  longest  upsequence  problem  grew  out  of 
our  systematic  attempt  to  apply  the  partitional  method  of  splitting  to 
this  problem.  (Note  that  Dijkstra's  algorithm  represents  the  singleton 
method.)  It  has  been  gratifying  to  discover  algorithms  which  demonstrate 
the splitting symmetry within the longest upsequence problem, as well
as  to  discover  the  analogy  between  the  sorting  algorithms  and  the 
longest  upsequence  algorithms  which  was  described  above.  The  following 
tree  summarizes  these  symmetries. 
                     Divide and Conquer Paradigm
                    /                           \
          singleton split                 partitional split
          /            \                  /              \
      sorting       upsequence        sorting         upsequence
         |              |                |                |
     insertion      Dijkstra's         merge           modified
       sort         algorithm          sort             merge

            <=>                              <=>
    (alter one instruction)        (alter one instruction)

Finally,  it  was  particularly  gratifying  to  find  that  the  algorithm 
which  emerged  for  the  longest  upsequence  problem,  as  a  result  of  our 
methods,  was  slightly  better  than  the  algorithm  previously  given  by 
Dijkstra. 